[Codex] Add handling for Conversational RAG to Validator API #84

ulya-tkch · 2025-05-30T22:05:10Z

No description provided.

jwmueller · 2025-05-30T23:52:37Z

tests/internal/test_validator.py

@@ -108,3 +112,40 @@ def test_update_scores_based_on_thresholds() -> None:
    for metric, expected in expected_is_bad.items():
        assert scores[metric]["is_bad"] is expected
    assert all(scores[k]["score"] == raw_scores[k]["score"] for k in raw_scores)
+
+
+def test_prompt_tlm_with_message_history() -> None:


add test to confirm there is no query rewriting happening, whenever this is the first user message

add test to confirm that the primary TrustworthyRAG.score(prompt, response) call happens with prompt reflecting the full chat history, not with prompt reflecting the rewritten query.

Confirm you are using this TLM utils method:
cleanlab/cleanlab-tlm@a479e32

to turn the chat history into a prompt string.

src/cleanlab_codex/internal/validator.py

elisno · 2025-06-03T03:11:51Z

src/cleanlab_codex/internal/validator.py

@@ -108,3 +132,38 @@ def is_bad(metric: str) -> bool:
    if is_bad("trustworthiness"):
        return "hallucination"
    return "other_issues"
+
+
+def validate_messages(messages: Optional[list[dict[str, Any]]] = None) -> None:


I think this name validate_messages should be more carefully chosen when the entire validator module reserves the name method validate in Validator for looking at the trustworthiness & Eval scores.

I'd bet we wouldn't change the Validator.validate api, but we could find a different name for validate_messages since it behaves quite differently.

Consider having validate_messages take messages as a required (positional argument):

Suggested change

def validate_messages(messages: Optional[list[dict[str, Any]]] = None) -> None:

def validate_messages(messages: list[dict[str, Any]]) -> None:

Everywhere it's being called, it takes in a messages argument.
The caller already sets a default value for that argument, so I'd advise against setting default values in two function signatures.

src/cleanlab_codex/internal/validator.py

elisno · 2025-06-03T03:26:17Z

src/cleanlab_codex/validator.py

@@ -296,6 +318,25 @@ def _remediate(self, *, query: str, metadata: dict[str, Any] | None = None) -> s
        codex_answer, _ = self._project.query(question=query, metadata=metadata)
        return codex_answer

+    def _maybe_rewrite_query(self, *, query: str, messages: list[dict[str, Any]]) -> str:


This _maybe... prefix implies that we might get something different from the method, other than a string. Should the check for self._tlm be done by the caller?

the maybe is supposed to suggest we might re-write the query or not

src/cleanlab_codex/internal/validator.py

ulya-tkch · 2025-06-09T21:46:13Z

closed because conversational capability moved to the backend

add to validator

4ae1897

ulya-tkch requested a review from elisno May 30, 2025 22:06

ulya-tkch added 4 commits May 30, 2025 15:27

add tlm key check

dd91b4c

add tlm key check test

8e977bf

fix type

ebfefd7

fix tests

e75d38c

jwmueller reviewed May 30, 2025

View reviewed changes

fix tests

fc8dc9d

elisno reviewed Jun 3, 2025

View reviewed changes

jwmueller reviewed Jun 3, 2025

View reviewed changes

src/cleanlab_codex/internal/validator.py Show resolved Hide resolved

ulya-tkch closed this Jun 9, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Codex] Add handling for Conversational RAG to Validator API #84

[Codex] Add handling for Conversational RAG to Validator API #84

Uh oh!

ulya-tkch commented May 30, 2025

Uh oh!

jwmueller May 30, 2025

Uh oh!

jwmueller May 30, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

elisno Jun 3, 2025 •

edited

Loading

Uh oh!

elisno Jun 3, 2025 •

edited

Loading

Uh oh!

Uh oh!

elisno Jun 3, 2025

Uh oh!

ulya-tkch Jun 9, 2025

Uh oh!

Uh oh!

ulya-tkch commented Jun 9, 2025

Uh oh!

Uh oh!

	def validate_messages(messages: Optional[list[dict[str, Any]]] = None) -> None:
	def validate_messages(messages: list[dict[str, Any]]) -> None:

[Codex] Add handling for Conversational RAG to Validator API #84

[Codex] Add handling for Conversational RAG to Validator API #84

Uh oh!

Conversation

ulya-tkch commented May 30, 2025

Uh oh!

jwmueller May 30, 2025

Choose a reason for hiding this comment

Uh oh!

jwmueller May 30, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

elisno Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elisno Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

elisno Jun 3, 2025

Choose a reason for hiding this comment

Uh oh!

ulya-tkch Jun 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ulya-tkch commented Jun 9, 2025

Uh oh!

Uh oh!

elisno Jun 3, 2025 •

edited

Loading

elisno Jun 3, 2025 •

edited

Loading